United States Patent 


Blomgren et al. 


US005685009A

[11] Patent Number: 5,685,009
[45] Date of Patent: *Nov. 4, 1997


[54] SHARED FLOATING-POINT REGISTERS 
AND REGISTER PORT-PAIRING IN A DUAL- 
ARCHITECTURE CPU 


[75] Inventors: James S. Blomgren, San Jose; David 
E. Richter, Milpitas; Cheryl Senter 
Brashears, Cupertino, all of Calif. 


[73] Assignee: Exponential Technology, Inc., San 
Jose, Calif. 


[*] Notice: The term of this patent shall not extend
beyond the expiration date of Pat. No.
5,481,693.


[21] Appl. No.: 564,719 
[22] Filed: Nov. 29, 1995 
Related U.S. Application Data 


[63] Continuation-in-part of Ser. No. 277,962, Jul. 20, 1994, Pat.
No. 5,481,693.

[51] Int. Cl. .......... G06F 9/30
[52] U.S. Cl. .......... 395/800; 395/566; 395/500;
395/568; 395/587
[58] Field of Search ... 395/800, 500,
395/566, 568, 587

[56] References Cited 

U.S. PATENT DOCUMENTS 

4,412,282  10/1983  Holden ............. 364/200
4,633,417  12/1986  Wilburn et al. ..... 364/550
4,763,242   8/1988  Lee et al. ......... 395/500
4,780,819  10/1988  Kashiwaki et al. ... 395/500
4,794,522  12/1988  Simpson ............ 395/500
4,812,975   3/1989  Adachi et al. ...... 395/500
4,821,187   4/1989  Ueda et al. ........ 395/591
4,841,476   6/1989  Mitchell et al. .... 395/500
4,942,519   7/1990  Nakayama ........... 395/290
4,972,317  11/1990  Buonomo et al. ..... 395/568
4,992,934   2/1991  Portanova et al. ... 395/385
5,077,657  12/1991  Cooper et al. ....
5,097,407   3/1992  Hiho et al. ........ 395/385
5,136,696   8/1992  Beckwith et al. .... 395/587
5,167,023  11/1992  de Nicolas et al. .. 395/527
5,210,832   5/1993  Maier et al. ....... 395/568
(List continued on next page.) 
OTHER PUBLICATIONS 


Hayashi et al., “A 5-6 MIPS Call-Handling Processor for 
Switching Systems,” IEEE Journal of Solid-State Circuits, 
vol. 24, No. 4, Aug. 1989, pp. 945-950. 

Garth, “Combining RISC and CISC in PC Systems,” IEE, 
Nov. 1991, pp. 10/1 to 10/5. 


Primary Examiner—Alyssa H. Bowler 
Assistant Examiner—John Follansbee 
Attorney, Agent, or Firm—Stuart T. Auvinen 


[57] ABSTRACT 


A dual-instruction-set central processing unit (CPU) is 
capable of executing floating point instructions from a 
reduced instruction set computer (RISC) instruction set and 
from a complex instruction set computer (CISC) instruction 
set. Floating point data is transferred from a CISC program 
to a RISC program running on the CPU by using shared 
floating point registers. The architecturally-defined floating 
point registers in the CISC instruction set are merged or 
folded into some of the architecturally-defined floating point 
registers in the RISC architecture so that these merged 
registers are shared by the two instruction sets. In 
particular, the floating-point exception-mask and flags reg- 
isters defined by each architecture are merged together so 
that CISC instructions and RISC instructions implicitly 
update the same merged flags register when executing 
floating point instructions. The RISC and CISC registers are 
folded together so that the CISC flags and RISC flags with 
the same function are merged to the same register bit. The 
floating-point data registers are also merged together, allow- 
ing a CISC program to pass floating-point data to a RISC 
program merely by writing one of its floating-point data 
registers, switching control to the RISC program, and the 
RISC program reading one of its floating-point data registers 
that is merged with and corresponds to the CISC floating- 
point data register that was written to by the CISC program. 
An extended-precision CISC data format is supported by 
pairing two of the RISC-size floating-point data registers. 


19 Claims, 8 Drawing Sheets 

SHARED FLOATING-POINT REGISTERS 
AND REGISTER PORT-PAIRING IN A DUAL- 
ARCHITECTURE CPU 


BACKGROUND OF THE INVENTION— 
RELATED APPLICATION 


This application is a continuation-in-part of application 
for a “Shared Register Architecture for a Dual-Instruction- 
Set CPU”, filed Jul. 20, 1994, U.S. Ser. No. 08/277,962, now 
U.S. Pat. No. 5,481,693, hereby incorporated by reference. 
This related application has a common inventor and is 
assigned to the same assignee as the present application. 


BACKGROUND OF THE INVENTION—FIELD 
OF THE INVENTION 


This invention relates to computing hardware, and more 
particularly to the architecture of floating point registers in 
a processor capable of executing from two instruction sets. 


BACKGROUND OF THE INVENTION— 
DESCRIPTION OF THE RELATED ART 


Processors, or central processing units (CPU’s) that are 
capable of executing instructions from two separate instruc- 
tion sets are highly desired at the present time. For example, 
a desirable processor executes user applications for the x86 
instruction set and the PowerPC™ instruction set. It is able 
to execute the tremendous software base of x86 programs 
that run under the DOS™ and WINDOWS™ operating 
systems from Microsoft of Redmond, Wash., and it could 
run future applications for PowerPC™ processors developed 
by IBM, Apple, and Motorola. 

Such a processor is described in the related copending 
application for a “Dual-Instruction-Set Architecture CPU 
with Hidden Software Emulation Mode”, filed Jan. 11, 1994, 
U.S. Ser. No. 08/179,926. That dual-instruction-set CPU has 
a pipeline which is capable of executing instructions from 
either a complex instruction set computer (CISC) instruction 
set, such as the x86 instruction set, or from a reduced 
instruction set computer (RISC) instruction set, such as the 
PowerPC™ instruction set. 

Two instruction decode units are provided so that instruc- 
tions from either instruction set may be decoded. Two 
instruction decoders are required when the instruction sets 
are separate because the instruction sets each have an 
independent encoding of operations to opcodes. For 
example, both instruction sets have an ADD operation or 
instruction. However, the binary opcode number which 
encodes the ADD operation is different for the two instruc- 
tion sets. In fact, the size and location of the opcode field in 
the instruction word is also different for the two instruction 
sets. In the x86 CISC instruction set, the opcode 03 hex is 
the ADD r,v operation or instruction for a long operand. This 
same opcode, 03 hex, corresponds to a completely different 
instruction in the PowerPC™ RISC instruction set. In CISC 
the 03 hex opcode is an addition operation, while in RISC 
the 03 hex opcode is TWI (trap word immediate), a control 
transfer instruction. Thus two separate decode blocks are 
necessary for the two separate instruction sets. 
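
The independence of the two encodings can be illustrated with a minimal sketch. The two lookup tables below are hypothetical and hold only the single opcode discussed above; an actual decoder must also account for the differing size and location of the opcode field in each instruction word.

```python
# Hypothetical single-entry opcode tables, for illustration only.
CISC_OPCODES = {0x03: "ADD r,v"}  # x86: add, long operand
RISC_OPCODES = {0x03: "TWI"}      # PowerPC: trap word immediate


def decode(opcode, instruction_set):
    """Return the operation that `opcode` encodes in the given instruction set."""
    table = CISC_OPCODES if instruction_set == "CISC" else RISC_OPCODES
    return table.get(opcode, "UNSUPPORTED")
```

The same numeric opcode yields a different operation depending on which table is consulted, which is why one decode block cannot serve both instruction sets.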

Programs may run in either or both instruction sets. Data 
and other information may be shared between RISC pro- 
grams and CISC programs. One way to share data and other 
information is to store the data in a register within the CPU 
before switching to the alternate instruction set, and making 
registers readable by either instruction set. Unfortunately, 
this requires that the instruction sets be extended to provide 
instructions to read the additional registers. The shared data 
could also be saved to a stack in memory, but this decreases 
performance due to the time required to transfer the data to 
memory and to adjust the stack pointers. 

Two sets of registers could be provided; one set for the use 
of CISC programs and a second set for the use of RISC 
programs. This is an expensive approach since the registers 
reside on the CPU die, which has a limited space available 
for registers. The additional registers require increasing the 
size of the CPU die, or deleting another function such as 
floating point processing. 

Floating point registers should also be shared. Floating- 
point numbers include a fraction or mantissa portion and an 
exponent portion. Wider formats for floating point data are 
typically used than for integer data. Often 64 or 80-bit 
formats are used for floating point numbers, while a 32-bit 
format is used for integers. Also, CISC may support a wider 
floating point format of 80 bits while RISC only provides for 
a 64-bit format. A 128-bit extended format is known in the 
art for both RISC and CISC. Separate status flags and mask 
or enable bits are provided for floating point operations. 

What is desired is a way to share some of the registers 
between a CISC and a RISC architecture on a dual- 
instruction-set CPU. It is further desired to have shared 
registers for data and system information. The shared reg- 
isters should not be extra registers in addition to the registers 
already defined by the CISC or RISC architectures, but 
should be registers already existing in the architectures. The 
shared registers must not cause conflicts between use in the 
two instruction sets or other undesirable effects. It is also 
desired to share floating point data, status, and control 
registers between the two architectures. 


SUMMARY OF THE INVENTION 


Certain CPU registers defined by a RISC and a CISC 
architecture are shared. CISC and RISC programs may alter 
and read these shared registers, allowing data and system 
information to be exchanged between programs running in 
the two instruction sets. 

A shared register system for a dual-instruction-set pro- 
cessor has a shared register for storing information to be 
transferred between a first program comprised of instruc- 
tions from a first instruction set and a second program 
comprised of instructions from a second instruction set. The 
first instruction set has a first encoding of operations to 
opcodes, while the second instruction set has a second 
encoding of operations to opcodes. The first encoding of 
operations to opcodes is substantially independent from the 
second encoding of operations to opcodes. 

A first means is for accessing the shared register from the 
first instruction set. The first means writes information into 
the shared register responsive to a first subset of instructions 
from the first instruction set. A second means is for accessing 
the shared register from the second instruction set. The 
second means reads information from the shared register 
responsive to a second subset of instructions from the 
second instruction set. 

The invention allows information to be transferred from 
the first program to the second program using the shared 
register. In other aspects of the invention, the shared register 
may be any one of the general-purpose registers accessible 
to both instruction sets, while the source and destination 
fields in the instruction words specify which general- 
purpose register to access. In still further aspects of the 
invention, the shared register is the flags register which 
stores flags or condition codes that are implicitly written by 
arithmetic-logic-unit (ALU) operations. Although the shared 
flags register contains a first flags field for flags from the first 
instruction set and a second flags field for the flags from the 
second instruction set, either instruction set can access the 
flags in the shared register regardless of which instruction 
set the flags are from. 


BRIEF DESCRIPTION OF THE DRAWINGS 


FIG. 1 is a diagram of a RISC register set. 

FIG. 2 is a diagram of a CISC register set. 

FIG. 3 is a diagram of a CISC condition flag register and 
a RISC condition register. 

FIG. 4 is a diagram showing that the RISC floating point 
status and control register (FPSCR) and the CISC floating 
point status register and CISC floating point control register 
may be combined. 

FIG. 5 shows in detail how RISC floating point status and 
control register (FPSCR) and the CISC floating point status 
register and CISC floating point control register are com- 
bined. 

FIG. 6 is an implementation of RISC floating point data 
register pairing for CISC extended precision. 

FIG. 7 shows floating point data register file having two 
write and four read ports. 


FIG. 8 shows shared registers in a dual-instruction-set 
CPU. 


DETAILED DESCRIPTION 


The present invention relates to an improvement in float- 
ing point processor architecture. The following description 
is presented to enable one of ordinary skill in the art to make 
and use the invention as provided in the context of a 
particular application and its requirements. Various modifi- 
cations to the preferred embodiment will be apparent to 
those with skill in the art, and the general principles defined 
herein may be applied to other embodiments. Therefore, the 
present invention is not intended to be limited to the par- 
ticular embodiments shown and described, but is to be 
accorded the widest scope consistent with the principles and 
novel features herein disclosed. 


This application is related to the copending application for 
a “Dual-Instruction-Set Architecture CPU with Hidden Soft- 
ware Emulation Mode”, filed Jan. 11, 1994, U.S. Ser. No. 
08/179,926 hereby incorporated by reference. 

A dual-architecture central processing unit (CPU) is 
capable of operating in three modes—RISC mode, CISC 
mode, and emulation mode. A first instruction decoder 
decodes instructions when the processor is in RISC mode, 
while a second instruction decoder decodes instructions 
while the processor is in CISC mode. Two instruction 
decoders are needed since the RISC and CISC instruction 
sets have an independent encoding of instructions or opera- 
tions to binary opcodes. 

The third mode of operation, emulation mode, also uses 
the first instruction decoder for RISC instructions, but emu- 
lation mode executes a superset of the RISC instruction set. 
Using emulation mode, individual CISC instructions may be 
emulated with RISC instructions. Thus, not all CISC instruc- 
tions need to be directly supported in the CPU’s hardware. 
Unsupported CISC instructions cause a jump to an emula- 
tion mode routine to emulate the unsupported CISC instruc- 
tion. Upon completion of the emulation mode routine, 
control is returned to the CISC program with the next CISC 
instruction. 
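
The control flow between CISC mode and emulation mode described above may be sketched as follows. This is only an illustrative model: the names DIRECT and EMULATION_ROUTINES, and the contents of the example routine, are assumptions and do not describe the actual hardware or its routines.

```python
# Hypothetical sets: instructions decoded directly in hardware, and
# RISC emulation routines for the unsupported ones.
DIRECT = {"ADD", "ADC"}
EMULATION_ROUTINES = {"AAA": ["add", "ascii_adjust", "creqv"]}


def run_cisc(program):
    """Return a trace of (mode, operation) pairs for a list of CISC mnemonics."""
    trace = []
    for insn in program:
        if insn in DIRECT:
            trace.append(("CISC", insn))
        else:
            # Unsupported opcode: enter emulation mode, run the routine,
            # then continue with the next CISC instruction.
            for risc_op in EMULATION_ROUTINES[insn]:
                trace.append(("EMULATION", risc_op))
    return trace
```

Running the sketch on the sequence ADD, AAA, ADC shows the processor leaving CISC mode only for the unsupported instruction and resuming with the following CISC instruction.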


RISC INSTRUCTIONS NEED ACCESS TO CISC 
REGISTERS 


Emulation of CISC instructions with RISC instructions 
creates a need for the RISC instructions to have access to 
CISC registers. For example, a CISC branch instruction may 
be emulated by an emulation routine of RISC instructions. 
The CISC branch instruction may be a conditional branch 
that only branches if a certain bit in a condition code register 
is set, perhaps by a previous CISC instruction. Since the 
CISC condition code register is part of the CISC 
architecture, but not the RISC architecture, the condition 
code register is not visible to the RISC instructions in the 
emulation routine. However, the RISC emulation routine 
must have access to this CISC condition code register to 
determine if the branch should be taken. 

The more complex CISC floating point instructions may 
also be emulated by a routine of RISC floating point 
instructions. These RISC floating point instructions thus 
have to be able to access the CISC floating point data 
registers. The CISC control and status registers must also be 
accessed by the RISC floating point instructions to deter- 
mine if an exception has been enabled, and to set the status 
flags depending upon the result of the RISC floating point 
operations and the CISC enable bits. For example, the CISC 
enable bits may enable exception reporting for divide-by- 
zero, but not for overflow. 

The RISC instructions emulating the complex CISC 
instruction must first read the divide-by-zero and overflow 
enable bits in the CISC control register, and then set the 
divide-by-zero or overflow flags in the CISC status register 
if a divide-by-zero or overflow exception occurs when the 
RISC instructions are executed when the divide-by-zero or 
overflow bit is enabled in the CISC control register. 


RISC REGISTER SET 


FIG. 1 is a diagram of a register set for a RISC architec- 
ture such as the PowerPC™. Registers that are visible to a 
user program are shown as user register space 10. Supervi- 
sory programs such as operating systems are able to see all 
of the registers in the user register space 10 and the registers 
in the supervisor’s register space 12. The user registers 
include general-purpose registers 14 which are used by 
programs for temporary storage of operands and results, and 
for address formation. Floating point data registers 16 are 
provided for storing floating point numbers that a numeric 
processor operates on. Condition register 20 contains con- 
dition codes set by various instructions and is useful for 
setting and checking conditions for conditional branch 
instructions. Integer exception register 18 contains bits that 
are versions of overflow bits. It contains information on 
overflows and carries that occurred in an arithmetic-logic- 
unit (ALU) when the instruction was executed. Link register 
22 contains the branch target address when a special branch 
to link register instruction is executed. Count register 24 
holds a value for a loop count which can be decremented, 
providing a simple way of programming loops. 

A supervisory program such as an operating system has 
access to additional registers in the supervisor’s register 
space 12. Supervisor general-purpose registers 26 are for 
general use by the supervisory program. Segment registers 
28 and block-address translation registers 32 are for address 
translation functions. Machine state register 36 defines the 
state of the processor, including reset, and CISC/RISC/ 
emulation mode. Machine state register 36 contains a 
privilege-level bit, the PR bit, to indicate if the processor is 
running in user or supervisor mode when RISC mode is 
active. An additional bit, the xE bit, is included in machine 
state register 36 to indicate CISC and emulation modes. The 
xE bit and the PR bit are encoded as shown in Table 1. 


TABLE 1 
Machine Status 

xE bit   PR bit   Processor Mode 
0        0        RISC Supervisor 
0        1        RISC User 
1        0        x86 Emulation 
1        1        x86 CISC 
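
The encoding of Table 1 may be expressed as a small lookup, given here only as a sketch. The function name is hypothetical, and the emulation row is taken to be xE=1, PR=0, the one combination not otherwise listed.

```python
def processor_mode(xe, pr):
    """Decode the xE and PR bits of the machine state register per Table 1."""
    modes = {
        (0, 0): "RISC Supervisor",
        (0, 1): "RISC User",
        (1, 0): "x86 Emulation",  # assumed row: only unlisted combination
        (1, 1): "x86 CISC",
    }
    return modes[(xe, pr)]
```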


Machine status save restore 0 register 30 saves the effective 
address of the instruction following the instruction causing 
an exception or a system call instruction. Machine status 
save restore 1 register 34 saves part of machine state register 
36 and other information on the cause of an exception when 
an exception occurs. In particular, machine status save 
restore 1 register 34 receives the PR bit and xE bit from the 
machine state register 36. Machine status save restore 1 
register 34 thus saves the mode the processor was in at the 
time of an exception or other event, and is used to restore the 
processor to that mode when exception processing is com- 
plete. A return from interrupt (rfi) instruction at the comple- 
tion of the exception processing restores the xE and PR bits 
to the machine state register 36 from machine status save 
restore 1 register 34. Together, machine status save restore 
registers 30, 34 save the state of the processor when an 
exception occurs, allowing the system to return to the user 
program once the exception handler routine is completed. 
Hardware-specific registers 38 contain miscellaneous 
implementation-specific information such as extended fea- 
tures. 
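
The saving and restoring of the mode bits around an exception may be modeled as below. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, only the xE and PR bits are modeled, and the mode in which the handler runs is passed in as a parameter because it is not specified in the passage above.

```python
class Cpu:
    """Hypothetical model of the mode bits only; real registers hold more state."""

    def __init__(self, xe, pr):
        self.msr = {"xE": xe, "PR": pr}  # machine state register 36
        self.srr0 = None                 # machine status save restore 0 register 30
        self.srr1 = None                 # machine status save restore 1 register 34

    def take_exception(self, next_address, handler_xe, handler_pr):
        # Save the return address and the interrupted mode, then switch mode.
        self.srr0 = next_address
        self.srr1 = dict(self.msr)
        self.msr = {"xE": handler_xe, "PR": handler_pr}

    def rfi(self):
        # Return from interrupt: restore the saved mode bits and resume.
        self.msr = dict(self.srr1)
        return self.srr0
```

A CISC program interrupted at a given address thus resumes in CISC mode at that address once the handler executes the return from interrupt.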

Floating-point status and control FPSCR register 52 con- 
tains control or enable bits which enable reporting of dif- 
ferent types of exceptions that can occur during floating 
point operations. FPSCR register 52 also contains flags or 
status bits that are set by the floating point operation, such 
as zero or negative result flags and overflow flags. 


CISC REGISTER SET 


FIG. 2 is a diagram of a register set for a CISC architec- 
ture such as the x86 used in microprocessors by Intel 
corporation of Santa Clara, Calif., Advanced Micro Devices 
of Sunnyvale, Calif., and Cyrix Corporation of Richardson, 
Tex. Registers that are visible to a user program are shown 
as user register space 11. Supervisory or system-level pro- 
grams such as operating systems are able to see all of the 
registers in the user register space 11 and the registers in the 
system-level register space 13. The user registers include 
general-purpose registers 15 which are used by programs for 
temporary storage of operands and results, and for address 
formation. Segment registers 17 are provided for generating 
linear addresses. Floating point data registers 50 store data 
in a floating-point format, including the fraction or mantissa 
portion and the exponent and sign. Flags register 21 contains 
flags or condition codes set by various instructions and is 
used for setting and checking conditions for conditional 
branch instructions. Instruction pointer 19 contains the 
address of the instruction currently being executed. 

Floating-point control register 55 contains control or 
enable bits which enable reporting of different types of 
exceptions that can occur during floating point operations. 
Floating point status register 57 contains flags or status bits 
that are set by the floating point operation, such as zero or 
negative result flags and overflow flags. 

A system-level program such as an operating system has 
access to additional registers in the system-level register 
space 13. Control registers 31, 35, 39 define the state of the 
processor, including protected or real modes, exception 
handling, cache enabling, and contain the base address of 
page tables. Breakpoints and control, performance 
monitoring, real-time clocks and other control may also be 
included in control registers 31, 35, 39. Floating point 
instruction pointer register 38 contains the address of the 
instruction last executed by the floating point unit. It must be 
saved because a floating point exception may be signaled 
after other instructions pass through the integer pipeline. It 
allows an exception handler routine to find the exact address 
causing a floating point exception. 


CONDITION CODE/FLAGS REGISTER 


Condition codes are employed by both RISC and CISC 
instruction sets. Instructions that use the arithmetic-logic- 
unit (ALU) or floating point unit (FPU) may produce a result 
having a zero or negative value. These instructions cause a 
flag or condition code in register 21 or 20 to be set when the 
result is zero or negative. Floating-point instructions may 
also set a zero flag in status register 57 or FPSCR register 52. 
Iterative loops may be programmed using such flags. For 
example, a simple loop may execute a series of instructions 
and decrement a loop variable each time the series of 
instructions in the loop is executed. The loop variable is 
initially set to the number of times to execute the loop. At the 
end of the loop, an ALU instruction subtracts one from the 
loop variable. When the loop variable becomes zero, the 
zero flag is set. A conditional branch instruction checks the 
zero flag and exits the loop when zero is reached. 

Many other flags may be defined. For example, the x86 
CISC EFLAGS register defines the flags of Table 2 that are 
set or cleared by ALU instructions depending on the result 
of the instruction. 


TABLE 2 

CF   Carry Flag              Set if carry-out or borrow 
PF   Parity Flag             Set if low 8 bits have even parity 
AF   Auxiliary (2nd) Carry   Bit 3 carry-out, used for BCD 
ZF   Zero Flag               Set if all bits are zero 
SF   Sign Flag               Set if highest-order bit is one 
DF   Direction Flag          Incr. or Decr. Addr. (String Instr) 
OF   Overflow Flag           Signed overflow to highest bit 
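
As an illustration of the result flags of Table 2, the following sketch computes them for an addition. The function name and the default 32-bit operand width are assumptions. The direction flag DF is omitted because it is set by string-control instructions rather than derived from an ALU result.

```python
def add_flags(a, b, bits=32):
    """Compute the Table 2 result flags for the addition a + b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    return {
        "CF": int(total > mask),                           # carry out of the top bit
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0), # even parity of low 8 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),            # carry out of bit 3
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(bool((a ^ result) & (b ^ result) & sign)),
    }
```

For example, adding 1 to 0xFFFFFFFF sets CF and ZF but not OF, while adding 1 to 0x7FFFFFFF sets SF and OF but not CF.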


Other bits in the x86 EFLAGS register are not flag bits set 
by operations but are control bits that define how the 
processor operates. Table 3 shows these control bits. 


TABLE 3 

TF     Trap Flag                               Trap after next instruction 
IF     Interrupt enabled Flag                  Enables external interrupts 
IOPL   Input/Output Privilege level (2 bits)   Max. Privilege for I/O instr. 
NT     Nested Task Flag                        Nested task being executed 
RF     Resume Flag                             Resume after breakpoint 
VM     Virtual Mode                            Virtual 8086 mode executing 
AC     Memory Alignment Check                  Mis-aligned data will fault 


The dual-instruction-set processor directly executes only 
the simpler CISC instructions. Many of these simpler CISC 
instructions set or clear the flag bits in Table 2. However, the 
control bits in Table 3 are set or cleared by complex or 
infrequently used CISC instructions such as privileged 
instructions. These instructions are therefore emulated. Only 
the simple CISC instructions modify the flag bits in the 
CISC EFLAGS register. Emulated instructions modify the 
control bits in the CISC EFLAGS register. 

FIG. 3 shows the CISC flags register 21 and the RISC 
Condition register 20. In the PowerPC™ RISC architecture, 
flags or condition codes are kept in the condition register 
(CR) 20. Condition register 20 is a 32-bit register divided 
into eight 4-bit fields, CR0 to CR7. Most RISC integer 
instructions generate the four bits in CR0, but do not modify 
any bits in fields CR1 to CR7. Table 4 shows the meanings 
of the four bits in CR0. 


TABLE 4 
RISC CR0 field 

CR0 bit   Name       Description 
0         Negative   Result is negative 
1         Positive   Result is positive and not zero 
2         Zero       Result is zero 
3         Overflow   Overflow has occurred 
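
The bit assignments of Table 4 may be sketched as a function of an integer result. The function name, the 32-bit default width, and the passing of the overflow condition as a parameter are assumptions made for illustration.

```python
def cr0_field(result, overflow=False, bits=32):
    """Return the CR0 field as [negative, positive, zero, overflow] per Table 4."""
    mask = (1 << bits) - 1
    result &= mask
    negative = int(bool(result >> (bits - 1)))  # sign bit of the result
    zero = int(result == 0)
    positive = int(not negative and not zero)
    return [negative, positive, zero, int(overflow)]
```

Exactly one of the first three bits is set for any result, so a conditional branch can test a single bit of the field.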


Floating point operations do not modify the bits in the 
CR0 field, but they do modify the four bits in the CR1 field. 
Table 5 shows the definitions of the four bits in CR1 set by 
floating point operations. 


TABLE 5 
RISC CR1 field 

CR1 bit   Name                   Description 
0         FP exception           Floating point exception has occurred 
1         FP enabled exception   A floating point enabled exception has occurred 
2         FP invalid exception   An invalid floating point exception has occurred 
3         FP Overflow            Floating Point Overflow has occurred 


A RISC compare instruction can set bits in any of the 4-bit 
fields CR0-CR7. Table 6 shows the definitions for the four 
bits in any field CRn set by the RISC compare instruction. 


TABLE 6 
RISC CRn field set by Compare Instruction 

CRn bit   Name           Description 
0         Less Than      register A is less than register B or immediate value from instruction word 
1         Greater Than   register A is greater than register B or immediate value from instruction word 
2         Equal To       register A is equal to register B or immediate value from instruction word 
3         Overflow       Copy of the Overflow bit in XER register 
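
A compare that writes its result into a chosen field CRn, per Table 6, might be sketched as follows. The function name is hypothetical, and the sketch assumes the PowerPC convention that CR0 occupies the four most significant bits of the 32-bit register and that bit 0 of each field is its most significant bit.

```python
def compare_into(cr, n, a, b, xer_overflow=0):
    """Compare a with b and write the Table 6 result into field CRn of `cr`."""
    field = (
        (int(a < b) << 3)      # bit 0: less than
        | (int(a > b) << 2)    # bit 1: greater than
        | (int(a == b) << 1)   # bit 2: equal to
        | int(xer_overflow)    # bit 3: copy of the XER overflow bit
    )
    shift = 28 - 4 * n         # CR0 is the most significant field
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)
```

Only the four bits of the selected field change; the other seven fields keep their previous contents.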


The compare instruction specifies which field to write its 
result to. Likewise, the RISC branch instructions can 
specify, as a condition for branching, any bit in any of the 
fields CR0 to CR7. Thus the programmer may write condi- 
tion codes to the other six fields in the CR register 20. The 
programmer may later use these other fields with the branch 
instruction using any of the bits in any of the fields CR0 to 
CR7. RISC move instructions may also load bits into any of 
the fields CR0 to CR7 of the CR register 20. The RISC move 
instruction may move bits from another register, or from one 
4-bit CR field to another field within CR register 20. A mask 
may be specified in the move instruction word to indicate 
which bits to move and which bits to not modify. A wealth 
of RISC logical instructions are provided that specify as 
inputs one or two bits in any of the 4-bit fields. A Boolean 
logical function is performed on the specified input bits, and 
the resulting output bit is written to any bit in any of the 4-bit 
fields in CR register 20. Thus RISC provides a variety of 
instructions to update, modify, and perform logical functions 
on parts of the CR register 20. 
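
The RISC logical instructions operating on individual CR bits can be modeled generically, as in the sketch below. The helper names are hypothetical, bits are addressed as (field, bit) pairs, and the sketch assumes that bit 0 of field CR0 is the most significant bit of the 32-bit register.

```python
import operator


def cr_shift(field, bit):
    """Shift amount for bit `bit` of field CR`field` (CR0 bit 0 is the MSB)."""
    return 31 - (4 * field + bit)


def cr_logical(cr, op, dst, src_a, src_b):
    """Apply a two-input Boolean `op` to two CR bits and write the result to `dst`.

    `dst`, `src_a`, and `src_b` are (field, bit) pairs.
    """
    a = (cr >> cr_shift(*src_a)) & 1
    b = (cr >> cr_shift(*src_b)) & 1
    out = op(a, b) & 1
    shift = cr_shift(*dst)
    return (cr & ~(1 << shift) & 0xFFFFFFFF) | (out << shift)
```

For instance, `cr_logical(cr, operator.and_, (2, 0), (7, 2), (7, 3))` writes the AND of two bits of CR7 into bit 0 of CR2 without disturbing any other bit.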


CISC AND RISC FLAGS REGISTERS MAY BE 
COMBINED 


Although fields CR2 to CR7 may be updated in a variety 
of ways by RISC instructions, the inventors have recognized 
that fields CR2 to CR7 may be infrequently updated while 
fields CR0 and CR1 are typically updated frequently. A 
programmer has to explicitly decide to update fields CR2 to 
CR7, while fields CR0 and CR1 are implicitly updated by 
many RISC instructions. 


The inventors have also recognized that most CISC 
instructions update bits 0 to 11 in the CISC EFLAGS 
register, while few CISC instructions update bits 12 to 31 in 
the CISC EFLAGS register. FIG. 3 compares the CISC flags 
register 21 and the RISC Condition register (CR) 20. FIG. 3 
shows that simple CISC integer instructions update flags in 
bit-positions 0 to 11 of CISC EFLAGS register 21, while 
RISC integer instructions update bits in the 4-bit CR0 field. 
RISC floating point instructions update field CR1, while few 
RISC instructions update fields CR2 to CR7. If a RISC 
programmer can avoid using fields CR5 to CR7 in RISC CR 
register 20, which correspond to bit-positions 11 to 0 in the 
CISC EFLAGS register 21, then the CISC EFLAGS register 
21 can be folded into or combined with the RISC CR register 
20. Since the RISC programs that share data with CISC 
programs are typically RISC emulation routines, the RISC 
programmer is aware of these limitations and can avoid 
using fields CR5 to CR7. Standard RISC user programs that 
do not avoid using fields CR5 to CR7 are not able to take 
advantage of the data sharing features of the invention, but 
they are still able to take advantage of the cost savings of the 
invention because fewer registers are needed on the micro- 
processor. However, x86 CISC emulation routines written in 
RISC code greatly benefit by both sharing data using the 
shared registers and by cost savings. 

Complex CISC instructions modify the control bits of 
Table 3, which are in bit-positions 12 to 21 of EFLAGS 
register 21. These complex CISC instructions may be emu- 
lated with RISC instructions in the emulation mode of the 
dual-instruction-set processor. These control bits may be 
stored in memory rather than in EFLAGS register 21, 
freeing up these bits for use by CR2 to CR4. 
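
The correspondence between EFLAGS bit-positions and CR fields described above may be sketched as a simple index conversion. The function name is hypothetical. The sketch assumes that EFLAGS bit-position 0 is the least significant bit while CR bits are numbered from the most significant bit, which is consistent with bit-positions 11 to 0 falling in fields CR5 to CR7.

```python
def eflags_bit_to_cr(position):
    """Map an EFLAGS bit-position (0 = least significant) to (CR field, bit)."""
    index = 31 - position    # CR bit index, counted from the most significant bit
    return divmod(index, 4)  # (field number, bit within the field)
```

Under this mapping the flag bits at positions 0 to 11 land in CR5 to CR7, the control bits at positions 12 to 21 land in CR2 to CR4, and the uppermost eight positions land in CR0 and CR1.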


EMULATION OF CISC ENHANCED BY 
COMBINED FLAGS REGISTER 


The CISC EFLAGS register 21 and the RISC CR register 
20 are combined into a single 32-bit register in the dual- 
instruction-set processor. When a complex CISC instruction 
that updates a control bit in the EFLAGS register 21 is 
emulated, the RISC instructions in the emulation routine 
merely have to update the corresponding bit in one of the 
fields CR2 to CR4 in the RISC CR register 20, because the 
RISC CR register 20 and the CISC EFLAGS register 21 are 
the same shared register. For example, a complex CISC 
instruction ASCII Adjust for Add (AAA) performs an add 
operation with an adjustment or conversion for a decimal 
format. Sometimes a carry is generated by the AAA 
instruction, which then writes a one to the Carry Flag bit 
(CF) at bit-position 0 in the EFLAGS register 21. This 
complex CISC AAA instruction is not supported by the 
instruction decoder and signals an unsupported opcode 
exception, which causes emulation mode to be entered from 
CISC mode. An emulation routine of RISC instructions is 
executed to emulate the complex CISC instruction. This 
emulation routine contains separate add and ASCII conver- 
sion instructions and a RISC Boolean instruction which sets 
bit 3 in CR7, corresponding to bit-position 0, the CF bit, in 
the EFLAGS register 21. The RISC Boolean instruction 
setting this bit may be a CR-register Boolean XNOR instruc- 
tion (creqv) that exclusive-NOR’s bit 3 to itself, with field 
CR7 as its sources and destination. Once the emulation 
routine is completed, CISC mode is again entered and 
execution of the CISC program resumes at the following 
CISC instruction. This following CISC instruction may be 
an add-with-carry (ADC) instruction which is not emulated 
but directly reads the carry flag CF set by the emulated AAA 
instruction. Because CISC EFLAGS register 21 and RISC 
CR register 20 are implemented as the same hardware 
register on the CPU die, updating the RISC register also 
updates the register seen by CISC programs. 
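
The bit correspondence in this example (EFLAGS bit-position 0 to bit 3 of CR7) can be checked with a minimal Python sketch. It assumes that the RISC CR numbers bit 0 as the most-significant bit and that each CR field is four bits wide; the function names are illustrative only and do not describe the hardware.

```python
def eflags_bit_to_cr(eflags_bit):
    """Return (CR field, bit within field) for a CISC EFLAGS bit-position."""
    if not 0 <= eflags_bit <= 31:
        raise ValueError("EFLAGS bit-position must be 0..31")
    cr_bit = 31 - eflags_bit      # assumption: RISC bit 0 is the most-significant bit
    return divmod(cr_bit, 4)      # four bits per CR field


def set_cr_bit(merged, field, bit):
    """Set one bit of the merged 32-bit register, addressed the RISC way."""
    return merged | (1 << (31 - (field * 4 + bit)))


field, bit = eflags_bit_to_cr(0)      # CF is at EFLAGS bit-position 0
merged = set_cr_bit(0, field, bit)    # emulation routine sets the matching CR bit
carry_seen_by_cisc = merged & 1       # a CISC ADC reads EFLAGS bit-position 0
```

Because both views address the same 32-bit value, the bit set through the RISC view is the bit read through the CISC view.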

The emulation routine of RISC instructions, or other 
native RISC programs, may freely update bits in fields CR0
and CR1, because these bits correspond to reserved bits in 
CISC EFLAGS register 21. At the conclusion of the emu- 
lation routine, before CISC mode is entered, these bits in 
CR0 and CR1 are cleared so that they are all read as zero
when CISC mode instructions read CISC EFLAGS register 
21. 
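
The clearing step can be sketched as a single mask operation. This Python fragment assumes that CR0 and CR1 occupy the eight most-significant bits of the merged 32-bit register; the constant and function names are hypothetical.

```python
CR0_CR1_MASK = 0xFF000000   # assumption: CR0 and CR1 are the top two 4-bit fields


def clear_cr0_cr1(merged):
    """Zero fields CR0 and CR1 before CISC mode is re-entered."""
    return merged & ~CR0_CR1_MASK & 0xFFFFFFFF
```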


BENEFITS AND USES OF MERGED FLAGS 
REGISTERS 


Folding CISC EFLAGS register 21 and RISC CR register 
20 together brings additional benefits besides cost reduction 
by having fewer registers on the CPU die. RISC programs 
can examine the flag bits in CISC EFLAGS register 21 to 
determine the results generated by the CISC program using 
the existing RISC instructions. No special RISC instructions 
are needed to examine this information from the CISC 
program. The RISC program may examine the zero flag to 
determine if the CISC program had a zero result, which 
might indicate the end of an iterative loop. The CISC zero 
flag (ZF) at bit position 6 may be examined by a RISC 
instruction simply by reading bit 1 of field CR6. Likewise, 
any of the other flag bits may be examined by a RISC 
program by reading the corresponding bit in the RISC CR 
register 20. Particularly with emulation routines, having this 
information is critical. Because the RISC instruction set has 
so many instructions which can access RISC CR register 20 
directly, the emulation routine may be efficiently pro- 
grammed without many move or load/store instructions to 
make available the CISC EFLAGS register 21. Thus the 
emulation routine has a much higher performance than if the 
CISC EFLAGS register 21 had to be stored on a stack in
memory and retrieved for the emulation routine to examine. 
Even moving a separate CISC EFLAGS register 21 from 
one CPU register into the RISC CR register 20 for use by 
RISC branch instructions requires extra RISC instructions,
decreasing performance relative to the invention. 

The emulation routine can perform branches directly off 
the CISC flag bits. A RISC instruction in the emulation 
routine can branch off the CF or AF bits in the CISC
EFLAGS register by merely specifying the corresponding 
bit in the RISC CR register. The CR register is the most 
visible and accessible state register in the PowerPC™ RISC 
architecture. The EFLAGS register in the x86 CISC archi- 
tecture is likewise the most interesting CISC register 
because of the many state flags stored in it. Using the RISC 
CR register as a window into the CISC architecture provides
a versatile and powerful tool. 

The RISC and CISC condition code and flags registers are 
effectively merged together into a single flags register that is 
accessible by instructions from both instruction sets. The 
merged register is special because it is not just explicitly 
accessible as a register, but the merged register is also 
accessible implicitly. Instructions that implicitly update the 
flags register, whether RISC or CISC instructions, update the 
same merged register. Because the two instruction sets tend 
to use separate portions of the merged register, each instruc- 
tion set can use its portion of the merged register, without 
interfering with the other instruction set. Yet programs 
running in one instruction set can still observe the flags set 
by programs in the other instruction set. Thus information 
about the results generated by one instruction set may be 
made available to programs in the other instruction set. 


Information about the operating state of the x86 CISC
program is also available by reading the merged flags 
register since control bits are stored in bit-positions 11 to 21. 
These control bits, shown in Table 3, include virtual 8086 
mode, interrupt enabling and privilege levels, an indication 
of task nesting, debug trapping, and data alignment check- 
ing. Again, a RISC program merely has to read the proper bit 
in the merged register, which appears as the standard RISC 
CR register to the RISC program. Often the register does not 
even have to be explicitly read by the RISC program, but 
only implicitly read. A RISC conditional branch instruction 
can be set to branch on the bit in CR 5 corresponding to the 
CISC interrupt enable control bit (IF). A RISC program 
could branch to a routine to check and disable interrupts if 
the IF bit is set, but continue without disabling interrupts if 
the bit is zero, knowing that interrupts are not possible. Thus 
the CISC interrupt enable bit is used to direct program flow 
in the RISC program merely by branching on the shared 
CISC/RISC bit. No register transfers, loads, or even explicit 
reads were necessary. The invention provides a very clean, 
simple, and efficient way to pass information between pro- 
grams running in two different instruction sets. 
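
A branch on the shared interrupt enable bit can be modelled as follows. This is only a sketch: it assumes the x86 IF flag sits at EFLAGS bit-position 9, which corresponds to bit 2 of field CR5 when RISC bit 0 is taken as the most-significant bit. The function names and returned strings are illustrative.

```python
def cr_bit_is_set(merged, field, bit):
    """Test one bit of the merged register using RISC (field, bit) addressing."""
    cr_bit = field * 4 + bit
    return (merged >> (31 - cr_bit)) & 1 == 1


def risc_routine(merged):
    """Model of a RISC conditional branch on the CISC interrupt enable bit."""
    if cr_bit_is_set(merged, 5, 2):       # IF: EFLAGS bit 9 -> CR5 bit 2 (assumed)
        return "check and disable interrupts"
    return "continue"
```

No copy of the flags is made; the branch condition is simply a bit of the one merged register.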


FLOATING POINT REGISTERS COMBINED 


FIG. 4 is a diagram showing that the RISC floating point 
status and control register FPSCR 52 and the CISC floating 
point status register 57 and CISC floating point control 
register 55 may be combined. Both the RISC FPSCR 
register 52 and CISC floating point status register 57 contain 
status or flag bits which indicate when a floating point 
exception has occurred or when the result is zero, negative, 
etc. Both registers 52, 57 also contain one or more summary 
bits that indicate if any exception has occurred. This allows 
system software to read just the summary bit to determine if 
an exception has occurred. 

Both the RISC FPSCR register 52 and CISC floating point 
control register 55 contain control or enable bits which 
enable the reporting of various kinds of floating point 
exceptions. Rounding may also be controlled by
rounding control bits in these registers. 

Since both the CISC and RISC floating point status and 
control registers perform similar functions, the inventors 
have realized that these registers may be combined into a 
single register accessible by both instruction sets. 

FIG. 5 shows in detail how RISC floating point status and 
control register FPSCR 52 and the CISC floating point status
register 57 and CISC floating point control register 55 are
combined. Tables 7, 8, 9 show the bits in RISC FPSCR
register 52, and the corresponding bits in CISC floating point
status register 57 and CISC floating point control register 55.
While the order of the bits in the CISC registers is not 
identical to the order of the corresponding bits in RISC 
FPSCR register 52, the CISC bits may nevertheless be 
mapped into RISC FPSCR register 52. When a CISC 
instruction reads or writes to the CISC registers 55, 57, the 
bits from RISC FPSCR register 52 must be re-ordered to the 
CISC order, either by a simple hardware mapping or by 
emulation software. 


TABLE 7

CISC FP Status Register

Bit    Symbol  Name               Meaning
15     B       Error Summary      Equal to ES bit 7
14     C3      Condition Code     Similar to EFLAGS condition codes
13:11  TOP     Stack Top          Points to top of FP data stack
10:8   C2:C0   Condition Codes    Similar to EFLAGS condition codes
7      ES      Error Summary      Set if any exceptions detected, even if not enabled
6      SF      Stack Flag         Stack overflow or underflow
5      PE      Precision Flag     Not precise result; rounding occurred
4      UE      Underflow Flag     Result is too small to be represented
3      OE      Overflow Flag      Result is too large to be represented
2      ZE      Zero-Divide Flag   Divisor was zero
1      DE      Denormalized Flag  An operand is denormalized (operand has smallest
                                  exponent but non-zero mantissa)
0      IE      Invalid Op Flag    Operation on a NaN, infinity, unsupported format, etc.

TABLE 8

CISC FP Control Register

Bit    Symbol  Name               Meaning
11:10  RC      Round Control      Rounding type and direction
9:8    PC      Precision Control  Size of mantissa used
5      PM      Precision Mask     Enable mask for Precision Exception
4      UM      Underflow Mask     Enable mask for Underflow Exception
3      OM      Overflow Mask      Enable mask for Overflow Exception
2      ZM      Zero-Divide Mask   Enable mask for Zero-Divide Exception
1      DM      Denormalized Mask  Enable mask for Denormalized Exception
0      IM      Invalid Op Mask    Enable mask for Invalid Op Exception

TABLE 9

FPSCR Register for RISC and CISC

       RISC    CISC
Bit    Symbol  Symbol  Name                           Meaning
0      FX      ES      Error Summary                  Set if any exceptions detected, even if not enabled
1      FEX             Enabled Error Summary          Set if any enabled exception is detected
2      VX      IE      Invalid Op Flag                Invalid Operation
3      OX      OE      Overflow Flag                  Result is too large to be represented
4      UX              Underflow Flag                 Result is too small to be represented
5      ZX      ZE      Zero-Divide Flag               Divisor was zero
6      XX      PE      Precision Flag                 Loss of precision; rounding occurred
7      VXSNAN          NaN Flag                       NaN exception detected
8      VXISI           Subtract Infinity Flag         Exception detected by subtracting infinity
9      VXIDI           Divide Infinity Flag           Exception detected by dividing infinity
10     VXZDZ           Divide-by-Zero Flag            Exception detected by dividing by zero
11     VXIMZ           Multiply Infinity x Zero Flag  Exception detected by multiplying infinity by zero
12     VXVC    *DE     NaN Compare Flag               Exception detected by compare of NaN
13     FR      *PC     Incremented Flag               Incremented intermediate result
14     FI      *PC     Round Flag                     Rounded intermediate result
15     C               Result Codes                   Indicates type of number returned
16     FL      C0      Result Codes                   Indicates type of number returned
17     FG      C1      Result Codes                   Indicates type of number returned
18     FE      C3      Result Codes                   Indicates type of number returned
19     FU      C2      Result Codes                   Indicates type of number returned
20                     Reserved
21     VXSOFT          S/W Exception Flag             S/W Requested Exception
22     VXSQRT          SQRT Flag                      Invalid square-root exception detected
23     VXCVI   *DM     Integer NaN Flag               Integer convert of NaN exception detected
24     VE      IM      Invalid Op Mask                Enable mask for Invalid Op Exception
25     OE      OM      Overflow Mask                  Enable mask for Overflow Exception
26     UE      UM      Underflow Mask                 Enable mask for Underflow Exception
27     ZE      ZM      Zero-Divide Mask               Enable mask for Zero-Divide Exception
28     XE      PM      Precision Mask                 Enable mask for Precision Exception
29     NI      NI      IEEE Mode Bit                  1=CISC FP Mode, 0=RISC
30:31  RN      RC      Round Control                  Rounding type and direction


While most status and control bits overlap for the two 
instruction sets, a few CISC bits have no corresponding 
RISC bit. Thus some of the RISC bits in the FPSCR register 
have an alternate, unrelated function for CISC mode. These 
CISC bits are PC, DE, and DM, which are marked in Table 
9 with asterisks to indicate that the RISC functionality 
shown does not apply for CISC mode. For example, bit 23 
is VXCVI in RISC mode, which is the integer NaN flag, 
which is set when a conversion of NaN (not a valid number) 
to an integer is attempted. However in CISC mode, bit 23 is 
DM, Denormalized Mask which enables reporting of Denor- 
malized Exceptions. 
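
The re-ordering from the merged FPSCR register to the CISC status register layout might be sketched in Python as below. The mapping uses only the status bits for which Table 9 lists a CISC symbol and Table 7 gives a CISC bit position; bit 4 (UX) is left out because Table 9 shows no CISC symbol for it. The sketch assumes RISC bit 0 is the most-significant bit of a 32-bit value, and the names are hypothetical.

```python
# FPSCR bit (RISC numbering) -> CISC FP status register bit (Tables 9 and 7)
FPSCR_TO_CISC_STATUS = {
    0: 7,     # FX    -> ES
    2: 0,     # VX    -> IE
    3: 3,     # OX    -> OE
    5: 2,     # ZX    -> ZE
    6: 5,     # XX    -> PE
    12: 1,    # *DE (CISC-mode meaning of this bit)
    16: 8,    # FL    -> C0
    17: 9,    # FG    -> C1
    18: 14,   # FE    -> C3
    19: 10,   # FU    -> C2
}


def fpscr_bit(fpscr, n):
    """Read FPSCR bit n, assuming bit 0 is the most-significant of 32 bits."""
    return (fpscr >> (31 - n)) & 1


def cisc_status_word(fpscr):
    """Assemble the CISC-ordered status bits from the merged FPSCR value."""
    word = 0
    for risc_bit, cisc_bit in FPSCR_TO_CISC_STATUS.items():
        word |= fpscr_bit(fpscr, risc_bit) << cisc_bit
    return word
```

The same table could equally describe a fixed wiring in hardware; the loop only stands in for whichever mapping mechanism is used.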

The CISC precision control PC, bits 13, 14, are shared 
with the RISC FR, FI bits which indicate the type of 
rounding last performed. In CISC mode these bits 13, 14 are 
control bits rather than status bits, controlling the precision 
(either 24, 53, or 64-bit mantissas). 

For the rounding control, bits 30, 31, the RISC encoding 
is used for both CISC and RISC mode: 


00=Round to even if tie, else round to nearest.
01=Round to zero.
10=Round to positive infinity.
11=Round to negative infinity.
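
Since the RISC encoding is stored for both modes, a value written through the CISC control register would need translating. The Python sketch below assumes the conventional x86 RC encoding (00 nearest, 01 toward negative infinity, 10 toward positive infinity, 11 toward zero), which is not stated in this description; the names are illustrative.

```python
# x86 RC value -> RISC RN value stored in bits 30, 31 of the merged FPSCR
X86_RC_TO_RN = {
    0b00: 0b00,   # round to nearest (even if tie)
    0b01: 0b11,   # round to negative infinity
    0b10: 0b10,   # round to positive infinity
    0b11: 0b01,   # round to zero
}
RN_TO_X86_RC = {rn: rc for rc, rn in X86_RC_TO_RN.items()}


def x86_rc_to_rn(rc):
    """Translate a CISC rounding-control value to the RISC encoding."""
    return X86_RC_TO_RN[rc & 0b11]


def rn_to_x86_rc(rn):
    """Translate the stored RISC encoding back for a CISC read."""
    return RN_TO_X86_RC[rn & 0b11]
```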

When the IEEE Mode Bit is set, very small numbers that 
are smaller than the smallest normalized number are 
rounded to zero. However, when the IEEE bit is cleared, 
these very small numbers are not rounded to zero but are left 
as denormal numbers. Other bits in the CISC floating point 
status and control registers are contained in registers in the 
instruction decode unit. For example, the stack-top pointer, 
TOP, and the stack fault bits SF are not included in the 
FPSCR register. 

Thus the two CISC floating point status and control 
registers may be folded into the single RISC floating point 
status and control register FPSCR. This allows RISC float- 
ing point instructions to read the status information from 
CISC floating point instructions without having to physi- 
cally move the CISC status bits from a CISC floating point 
status register to another register. 


ADDRESS GENERATION REGISTERS 
COMBINED 


Other registers may also be folded together or combined. 
The RISC count register (CTR) is a 32-bit register that 
contains a loop count that can be decremented when a 
branch instruction is executed. It can be explicitly accessed 
by some RISC move instructions, and can be implicitly 
accessed by certain RISC branch instructions which cause the
CTR register to be read and decremented. 


One of the CISC segment registers (17 of FIG. 2) holds 
the base address of the code segment. This code segment 
register is needed to generate addresses for fetching instructions
when in CISC mode. The base address in the code
segment register is also needed to calculate the targets of a 
branch instruction. Thus the code segment register is 
accessed frequently. 

The RISC CTR register and the CISC code segment 
registers may be combined together in the dual-instruction- 
set processor. The combined register holds the CISC code 
segment base address during CISC mode. The code segment 
base address is left in the combined CTR/CS register when 
RISC mode is entered. The code segment base address is 
restored by the emulation routine to the combined CTR/CS 
register before CISC mode is re-entered. 

Since the CTR register is infrequently used, the code 
segment can remain in the combined CTR/CS register most 
of the time, even during RISC mode. RISC emulation 
routines may be programmed that do not use the CTR 
register, thus increasing performance of the emulation rou- 
tine. Since both the CISC code segment register and the 
RISC count register are needed by the branching unit, 
merging these into the same register provides a single shared 
register to supply both CISC and RISC address information 
to the branching unit. 


RETURN ADDRESS REGISTERS COMBINED 


The RISC machine status save restore 0 register (SRR0)
3 of FIG. 1 saves the effective address of the instruction
following the instruction causing an exception, or the effective
address of the instruction following a system call
instruction. When the exception handler routine completes
and a return-from-interrupt (rfi) instruction is executed, the
address that was stored in SRR0 is reloaded into the instruction
pointer so that program execution can continue with the
next instruction. Thus the SRR0 register provides a place for
the address of a RISC instruction that occurs after a RISC
instruction causing an exception.

In the dual-instruction-set processor, when a CISC program
causes an exception, a RISC emulation routine is
called and executed. Thus the normal CISC exception handling
hardware is not needed and no counterpart to the SRR0
register is necessary. The CISC CR2 register normally holds
the address of an instruction causing a page fault in a
CISC-only processor. This CR2 register is not needed in the
dual-instruction-set processor since page faults, like other
exceptions, are all handled by RISC emulation code. Other
CISC exceptions cause the instruction pointer (IP) to be
pushed on the stack in memory by the exception handler.
Pushing the IP to the stack is not performed in micro-code,
as in prior-art x86 processors, but by the emulation routine
for the dual-instruction-set processor. Any time emulation
mode is entered from CISC mode, regardless of the cause,
the address of the CISC instruction being executed is stored
into the RISC SRR0 register. When the emulation routine
completes, the processor switches back to CISC mode, the
address stored in register SRR0 is re-loaded into the instruction
pointer, and the next instruction in the CISC program is
fetched and executed.

If emulation mode is entered to handle an exception, then
SRR0 should point to the CISC instruction causing the
exception, so that CISC instruction can be re-started once
the exception handling is complete. When the emulation
routine determines the size of the CISC instruction, the size
is added to the address stored in the SRR0 register to get the
address of the next CISC instruction. This addition must
occur because CISC instructions can vary in size, CISC
instructions being 1-15 bytes in size. Thus SRR0 causes the
CISC program to continue at the instruction following the
CISC instruction being emulated, unless an exception
occurs.
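
The address adjustment performed by the emulation routine amounts to a single addition. A minimal Python sketch, assuming a 32-bit address that wraps on overflow (the function name is hypothetical):

```python
def next_cisc_address(srr0, instruction_size):
    """Advance the saved address past the emulated CISC instruction."""
    if not 1 <= instruction_size <= 15:
        raise ValueError("CISC instructions are 1-15 bytes in size")
    return (srr0 + instruction_size) & 0xFFFFFFFF   # assumption: 32-bit wrap
```

For a 3-byte instruction saved at address 0x1000, execution would resume at 0x1003; when the instruction must be re-started after an exception, the addition is simply not performed.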


The RISC SRR0 register, which normally holds the
address of an instruction causing an exception, is also used 
to hold the address of a CISC instruction that caused 
emulation mode to be entered because the CISC instruction 
was not supported in hardware but had to be emulated. These 
are two parallel uses, but for two different instruction sets. 
The additional hardware to support both of these functions 
is minimal because these functions are closely related. 


LR AND FP REGISTERS COMBINED 


The RISC link register (LR) provides a branch target 
address for a RISC branch conditional to link instruction. It 
is a 32-bit register. While most RISC branch instructions do 
not use the link register, some do, such as the RISC 
branch-and-link instruction. 

The CISC architecture requires that the address of a 
floating point instruction be saved. Since floating point 
operations may take several clock cycles to complete, sev- 
eral simple integer instructions could have completed execu- 
tion by the time an exception is signaled that was caused by 
the floating point instruction. Storing the address of the 
floating point instruction allows the exception handling 
routine to backtrack the code and sort out the integer 
instructions executed. This address of the floating point
instruction is stored in the FP IP register in the CISC 
architecture. In the dual-instruction-set processor this 
address is instead stored in the RISC link register. 

Storing the CISC floating point instruction’s address in 
the RISC link register may cause a problem if a RISC 
program contains a RISC instruction that uses the link 
register. If that happens, the CISC floating point instruc- 
tion’s address must be saved to a stack in memory or to 
another general-purpose register. It is believed that this is an 
infrequent occurrence and therefore the cost savings of
combining the registers justify sharing these two registers.
Most of the time the RISC link register is not used, so no 
conflict occurs. 

Since the link register is used to store the address of the 
RISC target, a path for an instruction address is already 
provided to this register. Thus the floating point instruction’s 
address may also use this instruction path to the shared LR 
register. 


GPR’S COMBINED 


There are 32 general-purpose registers (GPR’s) for RISC 
which may be explicitly read or written by integer RISC 
instructions. The x86 CISC architecture provides only 8 
general-purpose registers which can be read or written by a 
user program. CISC also provides 6 segment registers which 
contain segment base addresses that are used to calculate the 
linear address of code, data operands, and a stack. These 
segment registers can only be used for segmentation, a part 
of address generation, and have restrictions on reading and 
writing them with CISC instructions. Only x86 privileged or 
segment-load instructions can read or write the segment 
registers. All of these instructions are emulated by the
dual-instruction-set processor. Thus CISC-mode instruc- 
tions cannot directly read or write these segment registers. 

These 32 RISC general-purpose registers may be merged 
with the 8 CISC general-purpose registers and the 6 CISC 
segment registers. Table 10 shows how these registers are
used for RISC, CISC, and emulation mode.

TABLE 10

Shared GPR’s

RISC Mode  CISC Mode    Emulation Mode
GPR 0      EAX GPR      EAX GPR
GPR 1      ECX GPR      ECX GPR
GPR 2      EDX GPR      EDX GPR
GPR 3      EBX GPR      EBX GPR
GPR 4      ESP GPR      ESP GPR
GPR 5      EBP GPR      EBP GPR
GPR 6      ESI GPR      ESI GPR
GPR 7      EDI GPR      EDI GPR
GPR 8      ES Seg Base  ES Seg Base
GPR 9      CS Seg Base  CS Seg Base
GPR 10     SS Seg Base  SS Seg Base
GPR 11     DS Seg Base  DS Seg Base
GPR 12     FS Seg Base  FS Seg Base
GPR 13     GS Seg Base  GS Seg Base
GPR 14     N/A          Emulation Base Address
GPR 15     N/A          0 Base Address
GPR 16     N/A          GPR 16
GPR 17     N/A          GPR 17
GPR 18     N/A          GPR 18
GPR 19     N/A          GPR 19
GPR 20     N/A          GPR 20
GPR 21     N/A          GPR 21
GPR 22     N/A          GPR 22
GPR 23     N/A          GPR 23
GPR 24     N/A          GPR 24
GPR 25     N/A          GPR 25
GPR 26     N/A          GPR 26
GPR 27     N/A          GPR 27
GPR 28     N/A          GPR 28
GPR 29     N/A          GPR 29
GPR 30     N/A          Emulation Assist Address
GPR 31     N/A          Emulation Assist Data


Table 10 shows that in RISC mode, the 32 general- 
purpose registers are accessible as true general-purpose 
registers. Any of the 32 registers may be read or written by 
RISC programs. In CISC mode, there are only 8 general-purpose
registers, EAX through EDI, which share the same
physical registers with the RISC GPRs 0 to 7. A CISC 
program may load one of its GPR’s, such as EAX, with a 
data value, then switch to RISC mode, allowing the RISC 
program to read that value the CISC program placed in EAX 
merely by reading the RISC GPR 0. Since the RISC archi- 
tecture defines GPR 0 as a regular GPR, many instructions 
can access this register without an explicit load from 
memory or register-to-register transfer. For example, a RISC 
add instruction could specify the value in GPR 0 that was 
loaded by the CISC program merely by identifying GPR 0 
in one of the source fields in the RISC ADD instruction 
word. The result of the ADD may be written back to GPR 0 
or any other GPR. Thus no explicit transfer is needed to 
access the CISC data by the RISC program. 

The 6 CISC segment base address registers, ES Base to 
GS Base, may be implicitly read by the CISC program when 
generating an address. A RISC program may read or write 
these registers merely by specifying the corresponding GPR. 
If the CISC program required emulation code to load 
segment register FS Base with a base address, then the RISC 
emulation program merely has to identify GPR 12 in a RISC 
instruction to read or write this base address. 
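
The sharing described above can be modelled as one array viewed under two sets of names. The Python sketch below follows the register assignments of Table 10; the class and method names are hypothetical and only illustrate that no transfer is needed between the two views.

```python
CISC_GPR = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
            "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
CISC_SEG_BASE = {"ES": 8, "CS": 9, "SS": 10, "DS": 11, "FS": 12, "GS": 13}


class SharedRegisterFile:
    """One physical array of 32 registers seen by both instruction sets."""

    def __init__(self):
        self.gpr = [0] * 32

    def cisc_write(self, name, value):
        # A CISC program can only name EAX through EDI.
        self.gpr[CISC_GPR[name]] = value & 0xFFFFFFFF

    def risc_write(self, index, value):
        self.gpr[index] = value & 0xFFFFFFFF

    def risc_add(self, dest, src_a, src_b):
        # A RISC ADD names GPRs directly in its source and destination fields.
        self.gpr[dest] = (self.gpr[src_a] + self.gpr[src_b]) & 0xFFFFFFFF


regs = SharedRegisterFile()
regs.cisc_write("EAX", 5)                        # CISC program loads EAX
regs.risc_write(3, 7)
regs.risc_add(0, 0, 3)                           # RISC ADD reads GPR 0 directly
regs.risc_write(CISC_SEG_BASE["FS"], 0x20000)    # emulation code loads FS Base
```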

CISC programs, however, cannot freely access any RISC 
GPR except GPR 0 to 7. CISC programs may access GPR 
8 to 13 in a restricted way, since these registers correspond 
to the CISC segment registers ES Base to GS Base. The 
CISC program may use a special segment override in the 
CISC instruction word to access one of these segment base 
registers when calculating an address. The CISC architec- 
ture imposes limitations on accessing these segment 
registers, making them useful for transferring address infor- 
mation between the RISC and CISC programs, but not 
useful for transferring data. The CISC program can only 
implicitly access these segment registers for address gen- 
eration. 

CISC mode programs have no access to RISC GPR 14 to 
31, since there is no corresponding CISC register. However, 
emulation mode can access all 32 RISC registers, including
the first 8 registers, which are the CISC GPR’s, and the 6
CISC segment base registers. Unlike CISC mode, emulation mode can
freely access the CISC segment base registers. Emulation
mode executes RISC instructions, so the mechanism to 
transfer data and address information between CISC and 
emulation modes is similar to transfers between CISC and 
RISC modes as described above. The RISC instructions in 
emulation mode can implicitly access a CISC register by 
identifying the corresponding RISC GPR as a source in the 
RISC instruction word.

Emulation mode differs slightly from RISC mode. Nor- 
mal RISC address checking and page fault handling is 
performed for accesses to most registers. However, when 
any of the 6 CISC segment registers, or GPR 14 or 15, are 
used to generate an address, the CPU uses x86-type address 
checking and x86 page fault handling, rather than the normal 
RISC address checking and page faulting routines. 

GPR 14 and 15 are used in emulation mode as special 
emulation-mode segment base address registers. When emu- 
lation code generates an address using GPR 15 as one of the 
operands, no segment validity checking is performed at all, 
neither RISC nor x86 segment validity checking. Using 
GPR 15 allows for emulation code to generate an address 
without any segment checking. When emulation mode is 
entered, a 3-bit register is loaded with a pointer to one of the 
6 CISC segment base registers (GPR 8 to GPR 13). The 
pointer value loaded is the segment used by the CISC 
instruction being emulated. This is normally the data segment
register DS, or stack segment register SS, but a
segment over-ride prefix attached to the CISC instruction
could indicate that one of the other segment base registers be
used. When register 14 is read the segment checking rules
are determined by the last segment used by the last CISC
mode memory access. The last segment used is one of the
segments in registers 8 to 13. This allows emulation mode to
use CISC mode checking, but also have full control and
access of the segment registers. Thus using GPR 14 allows
the emulation code to generate an address using whichever
CISC segment register has been used by the CISC instruction.
This is a very powerful feature for emulation, saving
dozens of instructions in the emulation routine to examine 
and decode the CISC instruction word to determine which 
segment register should be used. 

Both registers GPR 14 and 15 are preferably loaded with 
the value zero so that they do not modify the address being 
generated. Thus using these registers in emulation mode 
alters the address checking being used. Emulation mode, 
although using RISC instructions, can have CISC address 
checking for certain addresses generated with the CISC 
segment base registers and the two emulation base registers 
GPR 14 and 15. 
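
How the base register chosen by emulation code selects the segment checking can be summarized in a short Python sketch. It models only the selection rule described above, not the checks themselves. Representing the 3-bit pointer as the GPR number of the last-used segment is an assumption made for illustration, as are the function name and the returned strings.

```python
SEGMENT_GPRS = {8: "ES", 9: "CS", 10: "SS", 11: "DS", 12: "FS", 13: "GS"}


def segment_checking_for_base(base_gpr, last_segment_gpr):
    """Return which segment checking applies to an emulation-mode address."""
    if base_gpr in SEGMENT_GPRS:
        return "x86:" + SEGMENT_GPRS[base_gpr]          # named segment register
    if base_gpr == 14:
        return "x86:" + SEGMENT_GPRS[last_segment_gpr]  # segment of emulated instruction
    if base_gpr == 15:
        return "none"                                   # no segment validity checking
    return "risc"                                       # normal RISC address checking
```

With GPR 14 as the base, the emulation routine need not decode the CISC instruction word to learn which segment applies; the pointer loaded on entry to emulation mode already records it.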

Some of the RISC GPR’s may be used by emulation mode 
for particular purposes. For example, GPR’s 14, 15, 30, and 
31 may be used by emulation routines for address generation 
within emulation mode, and for various assist functions. 

If RISC mode and emulation mode are to be both used at 
the same time on a system, then the RISC program should 
not overwrite the four special emulation mode registers, or 
the CPU hardware needs to provide two sets of registers for 
GPR 14, 15, 30, and 31, one set exclusively for RISC mode, 
with a second set exclusively for emulation mode, as well as 
other duplicated hardware. RISC-mode programs must also 
not overwrite GPR’s 0-13, which are used for CISC mode 
architectural registers. Because general RISC user programs 
write these registers, and may not have a need for transfer- 
ring data to a CISC program, a process or task switch from 
a CISC user program to a general RISC user program is 
handled as a normal task switch, with all registers being 
saved to a stack before the switch so that the values in the 
GPR’s are not overwritten and lost. 

The code segment base address is available in two sepa- 
rate registers: GPR 9 holds the CISC code segment base (CS 
Base) while the RISC count CTR register also holds this 
same code segment base. This is beneficial for modern 
pipelined and superscalar processors because the CTR/CS 
register can provide the branching unit with the code seg- 
ment base address, while the GPR array also can provide the 
code segment base address to the execution unit. Thus the 
code segment base address may be provided from two 
separate registers to two separate units within the processor. 
Since these units are often separated, having the separate 
registers can save the delay in transferring the code segment 
from the GPR’s to the branching unit. As the code segment 
base address is frequently used in address calculations, 
having it in two separate locations is useful, effectively 
doubling the available bandwidth for supplying this base 
address. 

Merging the GPR’s together with the CISC GPR and 
segment registers provides a very efficient and clean way of 
transferring address and data between CISC and RISC 
programs and emulation programs. Normal architectural 
features are used to access and transfer data. Data can be 
accessed explicitly by specifying the corresponding GPR as 
the source in the instruction word. 


FLOATING POINT DATA REGISTERS
COMBINED


Just as the general-purpose registers for integer data are 
combined for both RISC and CISC integer instructions, the 
floating point data registers are also combined for RISC and 
CISC floating point instructions. The advantages of allowing 
programs from both RISC and CISC instruction sets to
access the same data registers can thus apply to either or 
both integer and floating point data formats. The more 
complex CISC floating point instructions can be emulated 
with several simpler RISC floating point instructions which 
operate on the same set of floating point data registers and 
set flags in the same floating point status register. Thus 
switching between instruction sets can be accomplished 
quickly without saving the architectural registers out to main 
memory. 

Indeed, emulation of the more complex floating point 
instructions may be more beneficial than emulation of 
simpler integer instructions because of the extreme com- 
plexity of some of the complex floating point instructions. 
The more complex CISC floating point instructions include 
square root (FSQRT), and transcendental operations such as 
sine (FSIN), cosine (FCOS), partial tangent and arc-tangent 
(FPTAN, FPATAN), and several logarithmic operations. The 
complexity of these floating point instructions is much 
greater than that of simpler integer instructions. Thus emu- 
lation may be more desirable for floating point instructions 
than for integer instructions. The invention allows emulation 
to be performed with a second instruction set, such as a 
highly-optimized RISC instruction set. 


PAIRING OF RISC FLOATING POINT DATA 
REGISTERS FOR CISC EXTENDED PRECISION 

While integer data formats are typically 32-bits in width 
or size, floating point formats include more bits of precision 
by having mantissas of 53 or 64 bits. The range of these 
floating point formats is also increased by having exponents 
of 8, 11, or 15 bits. Standard or single-precision floating 
point uses 32 bits, with an 8-bit exponent and a 24-bit 
mantissa. The double-precision floating-point format uses 
64 bits, with an 11-bit exponent and a 53-bit mantissa. RISC 
includes both single and double precision floating point 
formats. CISC also includes these formats, in addition to a 
third format, extended precision. Extended precision uses 80 
bits, with a 15-bit exponent and a 65-bit mantissa, including 
one sign bit. The exponent is biased for single and double 
precision formats so a separate sign bit for the exponent is 
not needed. 
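
The three formats can be tabulated as a quick consistency check. This is a sketch only: the class name `FloatFormat` and its field names are invented for the example, and the mantissa widths are counted as the text counts them, so that exponent and mantissa widths sum to the total width. The bias formula is the usual one for biased exponents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    total_bits: int
    exponent_bits: int
    mantissa_bits: int  # counted as in the text, so the widths sum to total_bits

    @property
    def exponent_bias(self) -> int:
        # Usual bias for a biased exponent field of this width.
        return (1 << (self.exponent_bits - 1)) - 1


FORMATS = [
    FloatFormat("single", 32, 8, 24),
    FloatFormat("double", 64, 11, 53),
    FloatFormat("extended", 80, 15, 65),
]

if __name__ == "__main__":
    for fmt in FORMATS:
        assert fmt.exponent_bits + fmt.mantissa_bits == fmt.total_bits
        print(fmt.name, fmt.total_bits, fmt.exponent_bias)
```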

If the 80-bit extended precision format is supported for 
CISC, then the 64-bit RISC floating point data registers 
could be extended to 80 bits. However, this adds expense as 
16 bits of register cells must be added to each floating point 
data register. Since the x86 CISC architecture defines only 
eight floating point registers, one solution is to extend to 80 
bits just eight of the thirty-two RISC floating point data 
registers. 

The inventors have realized that none of the 32 RISC 
floating point data registers need to be extended to 80 bits. 
Instead, two 64-bit RISC registers may be paired to produce 
an 80-bit CISC format. Since only 8 registers are needed for 
CISC instructions, pairing two RISC registers for each CISC 
register requires 16 of the 32 RISC floating point data 
registers. Thus RISC programs still have 16 unused floating 
point data registers, while 16 registers are paired for the 8 
CISC 80-bit-format registers. 

FIG. 6 is an implementation of RISC floating point data 
register pairing for CISC extended precision. The merged 
CISC & RISC floating point data register file 60 includes 32 
floating point registers, each having 64 bits. These 64 bits in 
each register are divided into 53 mantissa bits and 11 
exponent bits for double precision. 

When the CISC 80-bit extended precision is needed, 
64-bit RISC registers are paired, which provides ample bits 
(128 bits) for the 80-bit format. The additional 48 bits are 
unused, although other information such as full, empty, or 
stack information could be stored in these 48 unused bits. 
The first eight registers are paired with the second eight 
registers to form the eight CISC 80-bit registers in the stack. 
For example, register 0 is paired with register 8, register 1 
with register 9, and register 7 with register 15. 
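
The pairing rule and the resulting register budget can be sketched as follows. The names `paired_registers`, `NUM_RISC_FP_REGS`, and so on are hypothetical, and the sketch indexes physical registers only; it does not model stack-relative addressing.

```python
from typing import Tuple

NUM_RISC_FP_REGS = 32   # registers in the merged file
NUM_CISC_FP_REGS = 8    # 80-bit registers needed for CISC
RISC_REG_BITS = 64
EXTENDED_BITS = 80


def paired_registers(cisc_index: int) -> Tuple[int, int]:
    """Return (first, paired) RISC register numbers for one CISC register."""
    if not 0 <= cisc_index < NUM_CISC_FP_REGS:
        raise ValueError("CISC register index out of range")
    return cisc_index, cisc_index + NUM_CISC_FP_REGS


PAIRS = [paired_registers(i) for i in range(NUM_CISC_FP_REGS)]
USED = {reg for pair in PAIRS for reg in pair}
FREE_FOR_RISC = NUM_RISC_FP_REGS - len(USED)                # registers left unpaired
UNUSED_BITS_PER_PAIR = 2 * RISC_REG_BITS - EXTENDED_BITS    # spare bits in each pair

if __name__ == "__main__":
    print(PAIRS[0], PAIRS[1], PAIRS[7])
    print(FREE_FOR_RISC, UNUSED_BITS_PER_PAIR)
```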

FIG. 6 shows a read from the first register in the 8-register 
stack. Register 0 provides 64 bits, including 53 mantissa bits 
and 11 exponent bits. The paired register, register 8, provides 
4 more bits of the exponent and 11 more bits of the mantissa. 
The 4 additional exponent bits are stored in the upper 4 bits 
of register 8, while the 11 additional mantissa bits are stored 
in the lowest 11 bits of register 8. The 4 additional exponent 
bits from register 8 are appended to the top (left, or 
most-significant) of the exponent bits from register 0, while the 11 
additional mantissa bits from register 8 are appended to the 
bottom (right, or least-significant) of the 53 mantissa bits 
from register 0. This arrangement minimizes the physical 
routing of metal traces on a semiconductor chip, thus 
reducing cost. The final 80-bit floating point operand is 
assembled, and the leading one of the mantissa is inserted 
and/or generated before being input to floating point adder 
64, or another floating point operation unit such as a 
multiplier or logic unit (not shown). A second operand is 
assembled from two other paired floating point registers in 
a manner similar to that shown for registers 0 and 8. 
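
The bit routing described above can be sketched in software. Several details here are assumptions not stated in the text: the first register is taken to hold the sign in bit 63, the 11 exponent bits in bits 62 to 52, and the remaining 52 mantissa bits in bits 51 to 0, and the inserted leading one is modeled as a parameter. The sketch shows only how the fields are concatenated; it does not convert between exponent biases.

```python
from typing import Tuple

MASK_11 = (1 << 11) - 1
MASK_52 = (1 << 52) - 1
MASK_63 = (1 << 63) - 1


def assemble_extended(first: int, paired: int, leading_one: int = 1) -> Tuple[int, int, int]:
    """Combine two 64-bit register values into (sign, 15-bit exponent, 64-bit significand)."""
    sign = (first >> 63) & 1
    exp_low = (first >> 52) & MASK_11       # 11 exponent bits from the first register
    frac_high = first & MASK_52             # upper mantissa bits from the first register
    exp_high = (paired >> 60) & 0xF         # 4 extra exponent bits, upper end of paired register
    frac_low = paired & MASK_11             # 11 extra mantissa bits, lower end of paired register
    exponent = (exp_high << 11) | exp_low   # extra bits prefixed as most-significant
    fraction = (frac_high << 11) | frac_low # extra bits appended as least-significant
    significand = ((leading_one & 1) << 63) | fraction
    return sign, exponent, significand


def split_extended(sign: int, exponent: int, significand: int) -> Tuple[int, int]:
    """Inverse routing for a write-back: return (first, paired) register values."""
    fraction = significand & MASK_63        # drop the leading one
    first = (sign << 63) | ((exponent & MASK_11) << 52) | (fraction >> 11)
    paired = ((exponent >> 11) << 60) | (fraction & MASK_11)
    return first, paired
```

A round trip through `split_extended` and `assemble_extended` returns the original register values whenever the 48 spare bits of the paired register are zero.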

Since the x86 CISC architecture defines only two input 
operands, four 64-bit read ports are needed to floating point 
data register file 60. Since PowerPC™ RISC defines 3 input 
operands, one additional read port is needed, for a total of 
four read ports. Likewise, two 64-bit write ports are needed 
to floating point data register file 60 to support the writing 
of an 80-bit result. FIG. 7 shows floating point data register 
file 60 having two write and four read ports. Multi-port 
register files are well-known in the art and can be con- 
structed from memory cells with multiple ports. 
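
One reading of the port counts is that each mode needs one 64-bit port per register touched, and the register file must satisfy the larger of the two requirements. The helper `ports_needed` and the single-result assumption for writes are illustrative only.

```python
def ports_needed(operands: int, registers_per_operand: int) -> int:
    """64-bit ports required when every operand spans the given number of registers."""
    return operands * registers_per_operand


# CISC extended precision: two input operands, each spanning a register pair.
CISC_READ = ports_needed(2, 2)
# RISC: three input operands, each held in a single 64-bit register.
RISC_READ = ports_needed(3, 1)
READ_PORTS = max(CISC_READ, RISC_READ)

# One result per instruction is assumed; an 80-bit result spans a register pair.
WRITE_PORTS = max(ports_needed(1, 2), ports_needed(1, 1))

if __name__ == "__main__":
    print(READ_PORTS, WRITE_PORTS)
```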


SHARED REGISTER ARCHITECTURE 


FIG. 8 is a diagram of the shared registers in the dual- 
instruction set processor. The CISC EFLAGS register and 
the RISC CR register are combined into a single 32-bit 
CR/EFLAGS register 40 that can be accessed by CISC user 
programs and RISC user programs and emulation code. The 
CISC code segment base address register and the RISC 
count CTR register are merged to a single CS/CTR register 
42, also accessible by CISC user programs and RISC user 
programs and emulation code. The RISC system save/restore 
(SRR0) register, which normally holds the address to 
return to after an interrupt has been processed, also holds the 
return address when emulation code is called. Thus SRR0 
can hold a RISC address or a CISC address. The RISC or 
CISC user can indirectly load SRR0 register 44 by causing 
emulation code to be entered, but cannot directly access 
SRR0 register 44. However, RISC supervisor code and 
emulation code have full access to SRR0 register 44. 

The RISC link register, which is used to hold a branch 
address, is combined with a CISC register that holds the 
instruction address of the most recent floating point instruc- 
tion. This merged FP-IP/LR register 46 is also indirectly 
accessible by CISC programs because they cannot directly 
read or write it, but can only load it by executing a floating 
point instruction. RISC and emulation programs can freely 
access this merged FP-IP/LR register 46. The 32 general- 
purpose registers from RISC are merged with the 8 GPR’s 
and 6 segment base registers from the CISC architecture into 
merged GPR’s 48. Four of the RISC GPR’s are used by 
emulation code for special uses, although emulation code 
can access all RISC and CISC registers. 

Data registers for numbers in floating-point format are 
also merged into one set of floating-point data register file 60 
for both RISC and CISC programs. Status and control bits 
for floating point operations in both RISC and CISC modes 
are combined into a single floating point status and control 
register, FPSCR register 52. 
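
The access rules stated for the merged registers of FIG. 8 can be summarized as a small lookup table keyed by reference numeral. This is a sketch: the context names (`cisc_user`, `risc_user`, `risc_supervisor`, `emulation`), the `Access` enum, and the helper `access_of` are invented for the example, "RISC programs" is mapped to `risc_user` as an assumption, and any combination the text does not mention is left out rather than guessed.

```python
from enum import Enum
from typing import Optional


class Access(Enum):
    DIRECT = "direct"      # can read or write the register itself
    INDIRECT = "indirect"  # can only load it as a side effect


# Keys are the reference numerals used in the description.
SHARED_REGISTERS = {
    40: {  # CISC flags register merged with the RISC condition register
        "cisc_user": Access.DIRECT,
        "risc_user": Access.DIRECT,
        "emulation": Access.DIRECT,
    },
    42: {  # CISC code segment base merged with the RISC count register
        "cisc_user": Access.DIRECT,
        "risc_user": Access.DIRECT,
        "emulation": Access.DIRECT,
    },
    44: {  # save/restore register holding the return address
        "cisc_user": Access.INDIRECT,
        "risc_user": Access.INDIRECT,
        "risc_supervisor": Access.DIRECT,
        "emulation": Access.DIRECT,
    },
    46: {  # CISC floating point instruction address merged with the RISC link register
        "cisc_user": Access.INDIRECT,
        "risc_user": Access.DIRECT,
        "emulation": Access.DIRECT,
    },
}


def access_of(register: int, context: str) -> Optional[Access]:
    """Return the stated access level, or None when the text does not say."""
    return SHARED_REGISTERS.get(register, {}).get(context)
```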


ALTERNATE EMBODIMENTS 


This improvement relates to a central processing unit 
(CPU) for a dual-instruction set architecture. While the 
detailed description describes the invention in the context of 
the PowerPC™ reduced instruction set computer (RISC) 
and the x86 complex instruction set computer (CISC), it is 
contemplated that the invention applies to other instruction 
sets besides PowerPC™ and x86, and to more than two 
instruction sets, and to architectures besides RISC and 
CISC, without departing from the spirit of the invention. The 
exact number of bits in each register may likewise be varied 
by persons skilled in the art without departing from the spirit 
of the invention, although architecture compatibility may be 
lost. 

The foregoing description of the embodiments of the 
invention has been presented for the purposes of illustration 
and description. It is not intended to be exhaustive or to limit 
the invention to the precise form disclosed. Many modifi- 
cations and variations are possible in light of the above 
teaching. It is intended that the scope of the invention be 
limited not by this detailed description, but rather by the 
claims appended hereto. 

We claim: 

1. A shared register system for a dual-instruction-set 
floating point processor, the shared register system 
comprising: 

a shared floating point register for storing information to 
be transferred between a first program comprised of 
floating point instructions from a CISC instruction set 
and a second program comprised of floating point 
instructions from a RISC instruction set, the CISC 
instruction set having a first encoding of operations to 
opcodes, the RISC instruction set having a second 
encoding of operations to opcodes, the first encoding of 
operations to opcodes being substantially independent 
from the second encoding of operations to opcodes; 

first means, coupled to the shared floating point register, 
for accessing the shared floating point register from the 
CISC instruction set, the first means writing informa- 
tion into the shared floating point register responsive to 
a first subset of instructions from the CISC instruction 
set; and 

second means, coupled to the shared floating point 
register, for accessing the shared floating point register 
from the RISC instruction set, the second means read- 
ing information from the shared floating point register 
responsive to a second subset of instructions from the 
RISC instruction set, 

whereby information is transferred from the first program 
to the second program using the shared floating point 
register. 

2. The shared floating point register system of claim 1 
wherein the shared floating point register is in a plurality of 
floating point data registers in the dual-instruction-set float- 
ing point processor, each register in the plurality of floating 
point data registers for storing a mantissa portion and an 
exponent portion of a number represented in a floating-point 
format. 

3. The shared floating point register system of claim 2 
wherein the plurality of floating point data registers include 
eight stack-accessible CISC floating point data registers. 

4. The shared floating point register system of claim 2 
wherein the plurality of floating point data registers store 
floating point data in a RISC format, the shared register 
system further comprising: 

pairing means for accessing a pair of floating point data 
registers and for combining data from both paired 
registers to produce an operand in an extended- 
precision format, 

whereby an operand in the extended-precision format is 
accessed by pairing two registers for storing floating 
point data in the RISC format. 

5. The shared floating point register system of claim 4 

wherein the paired registers comprise: 

a first register having a first exponent portion and a first 
mantissa portion, the first exponent portion and the first 
mantissa portion containing sufficient exponent and 
mantissa bits to store a number in a double-precision 
floating point format; 

a second register having an extended exponent portion 
and an extended mantissa portion; 

wherein the pairing means comprises: 

exponent-extension means, receiving the first exponent 
portion from the first register and receiving the 
extended exponent portion from the second register, for 
prefixing the extended exponent portion to the first 
exponent portion, wherein the extended exponent por- 
tion comprises the most-significant bits while the first 
exponent portion comprises the least significant bits of 
an exponent for the operand in the extended-precision 
format; 

mantissa-extension means, receiving the first mantissa 
portion from the first register and receiving the 
extended mantissa portion from the second register, for 
appending the extended mantissa portion to the first 
mantissa portion, wherein the extended mantissa por- 
tion comprises the least-significant mantissa bits while 
the first mantissa portion comprises the most significant 
mantissa bits of a mantissa for the operand in the 
extended-precision format, 

whereby the pairing means prefixes exponent bits but 
appends mantissa bits from the second register to form 
the operand in the extended precision format. 

6. The shared floating point register system of claim 1 
wherein the RISC instruction set is a PowerPC™ instruction 
set, and the CISC instruction set is an x86 instruction set. 

7. The shared floating point register system of claim 1 
wherein the shared floating point register comprises a first 
flags field for storing first flags implicitly set by floating 
point operations encoded by opcodes in the first subset of 
instructions from the CISC instruction set, the first flags also 
implicitly set by floating point operations encoded by 
opcodes in a third subset of instructions from the RISC 
instruction set, the second means for accessing the shared 
floating point register from the RISC instruction set writing 
information to the shared floating point register in response 
to instructions from the third subset of instructions from the 
RISC instruction set. 

8. The shared floating point register system of claim 7 
wherein the first flags field in the shared floating point 
register is implicitly read by first instructions having 
opcodes encoding conditional branch operations, and 
wherein the second flags field in the shared floating point 
register is implicitly read by second instructions having 
opcodes encoding conditional branch operations. 

9. The shared floating point register system of claim 8 
wherein the first flags include a zero flag indicating that a 
floating point operation had a zero-valued result. 

10. The shared floating point register system of claim 9 
wherein the first flags include an exception flag indicating 
that an exception occurred when executing a floating point 
instruction. 

11. The shared floating point register system of claim 10 
further comprising: 

control bits in the shared floating point register, the 
control bits for enabling reporting of various kinds of 
floating point exceptions; 

exception detection means for detecting various kinds of 
floating point exceptions when a floating point 
instruction is executed; 

exception disabling means, responsive to the control bits 
in the shared floating point register, for disabling the 
exception detection means and preventing the reporting 
of an exception and setting the exception flag in the first 
flags, 

whereby the control bits determine if the exception flag in 
the first flags is set when the exception occurs. 

12. The CPU of claim 11 wherein the exception is selected 
from the group consisting of an overflow exception, an 
underflow exception, a divide-by-zero exception, and a 
loss-of-precision exception. 

13. A floating point unit (FPU) for executing first instruc- 
tions from a first instruction set and for executing second 
instructions from a second instruction set, the first 
instructions having a first field for specifying a destination floating 
point register on the FPU, the second instructions having a 
second field for specifying a source floating point register on 
the FPU, the FPU comprising: 

a first instruction decoder, receiving the first instructions 
from the first instruction set, the first instruction 
decoder providing decoded first instructions; 

a second instruction decoder, receiving the second 
instructions from the second instruction set, the second 
instruction decoder providing decoded second instruc- 
tions; 

a floating point execution unit for executing first floating 
point instructions and for executing second floating 
point instructions, the floating point execution unit 
receiving decoded first instructions from the first 
instruction decoder, the floating point execution unit 
receiving decoded second instructions from the second 
instruction decoder; and 

a plurality of floating point registers on the FPU, a 
selected register in the plurality of floating point reg- 
isters being written to by the floating point execution 
unit when the floating point execution unit receives a 
decoded first instruction, the selected register specified 
by the first field for specifying a destination register on 
the FPU, 

the selected register in the plurality of floating point 
registers being read from by the floating point execu- 
tion unit when the floating point execution unit receives 
a decoded second instruction, the selected register 
specified by the second field for specifying a source 
floating point register on the FPU, 

whereby data may be transferred from a first instruction to 
a second instruction via the selected register. 

14. The FPU of claim 13 wherein the first instruction set 
has a first encoding of operations to opcodes, the second 
instruction set has a second encoding of operations to 
opcodes, the first encoding of operations to opcodes being 
substantially independent from the second encoding of 
operations to opcodes. 

15. The FPU of claim 14 wherein the second instruction 
set is a reduced instruction set computer (RISC) instruction 
set and the first instruction set is a complex instruction set 
computer (CISC) instruction set. 

16. A central processing unit (CPU) for executing first 
instructions from a first instruction set and for executing 
second instructions from a second instruction set, the CPU 
comprising: 

a first instruction decoder, receiving the first instructions 
from the first instruction set, the first instruction 
decoder providing decoded first instructions; 

a second instruction decoder, receiving the second 
instructions from the second instruction set, the second 
instruction decoder providing decoded second instruc- 
tions; 

an execution unit for executing first instructions and for 
executing second instructions, the execution unit 
receiving decoded first instructions from the first 
instruction decoder, the execution unit receiving 
decoded second instructions from the second instruc- 
tion decoder; and 

a condition code register comprising a first condition code 
and a second condition code, the first condition code 
being set by the execution unit when the execution unit 
receives a decoded first instruction and an arithmetic 
operation is executed, the second condition code being 
set by the execution unit when the execution unit 
receives a decoded second instruction and an arithmetic 
operation is executed, 


the first condition code being read by the execution unit 
when the execution unit receives a decoded first 
instruction having a first opcode indicating that the first 
condition code be read; 


the first condition code also being read by the execution 
unit when the execution unit receives a decoded second 
instruction having a second opcode indicating that the 
first condition code be read; 


the second condition code being read by the execution 
unit when the execution unit receives a decoded second 
instruction having a third opcode indicating that the 
second condition code be read; 

a floating point status and control register, having control 
bits for enabling detection of floating point exceptions 
by the execution unit, the floating point status and 
control register further having status bits for indicating 
when an enabled exception has been detected by the 
execution unit when executing a floating point instruc- 
tion in the first instruction set or in the second instruc- 
tion set, 


whereby the first condition code set by execution of the 
first instruction set may be read by the first instruction 
set or the second instruction set. 

17. The CPU of claim 16 wherein the first opcode 
designates a floating point operation that implicitly writes 
the first condition code, the second opcode and the third 
opcode encoding operations for a conditional branch opera- 
tion that reads the condition code register to determine if a 
branch is taken. 

18. The CPU of claim 17 wherein the first instruction set 
has a first encoding of operations to opcodes, the second 
instruction set has a second encoding of operations to 
opcodes, the first encoding of operations to opcodes being 
substantially independent from the second encoding of 
operations to opcodes. 

19. The CPU of claim 16 wherein the enabled exception 
is selected from the group consisting of an overflow 
exception, an underflow exception, a divide-by-zero 
exception, and a loss-of-precision exception. 